Health report, report diff fixes #1254

Merged: milanmajchrak merged 23 commits into dtq-dev from ufal/health-report-checks-fix on Jun 12, 2026

Conversation

@Kasinhou Kasinhou commented Feb 23, 2026

Problem description

Followed issues:
https://github.com/dataquest-dev/dspace-customers/issues/430
#1250

Manual Testing (if applicable)

Copilot review

  • Requested review from Copilot

Summary by CodeRabbit

  • New Features

    • Select multiple checks with repeatable -c for health reports
    • List and compare reports by report ID in report-diff; enhanced summary, key-changes table, and skipped-checks reporting
  • Improvements

    • Unified help option (-h) and clearer CLI semantics; report output now includes persisted report ID and writes to a report file (-r)
    • Human-readable size diffs, improved email/error handling, and more robust option validation
  • Tests

    • Expanded integration tests for CLI behaviors, validations, listing, and comparisons
  • Refactor

    • Removed legacy report runner and deprecated report-filtering/query paths
  • Chores

    • Removed legacy launcher healthcheck command

Copilot AI left a comment

Pull request overview

This PR updates the Health Report and Report Diff CLI tooling in CLARIN-DSpace to improve comparison correctness when reports contain different check selections and to align CLI option handling/help output.

Changes:

  • Remove the legacy healthcheck CLI command from launcher.xml and rely on the Spring script-service based health-report.
  • Update report-diff to normalize report JSON to the intersection of check names, add a “Skipped Checks” section, and make key-field mappings independent of check ordering via selector-based paths.
  • Standardize help flags to -h/--help and update tests/fixtures accordingly.
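
The intersection normalization described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's ReportDiff code: the class name CheckIntersection and its two methods are hypothetical, and checks are represented only by their names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; it only illustrates the idea of comparing by check name. */
final class CheckIntersection {

    /** Names present in both reports, kept in the order of the source report. */
    static List<String> common(List<String> source, List<String> target) {
        Set<String> result = new LinkedHashSet<>(source);
        result.retainAll(target);
        return new ArrayList<>(result);
    }

    /** Names present in 'report' but absent from 'other'; candidates for a "Skipped Checks" section. */
    static List<String> skipped(List<String> report, List<String> other) {
        Set<String> result = new LinkedHashSet<>(report);
        result.removeAll(other);
        return new ArrayList<>(result);
    }
}
```

Comparing only the names returned by common() is what keeps a check that exists in just one report from showing up as a null value in the diff.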

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| dspace/config/launcher.xml | Removes legacy healthcheck command entry. |
| dspace-api/src/test/java/org/dspace/scripts/ReportDiffIT.java | Updates help flag usage and adds coverage for skipped checks / name-based comparisons. |
| dspace-api/src/main/resources/report-diff-fields.json | Switches from index-based paths to selector-by-check-name paths. |
| dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java | Changes -i to -h and clarifies -c semantics for filtering comparisons. |
| dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java | Implements intersection normalization, skipped-check reporting, and selector-based field resolution. |
| dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java | Changes help flag to -h, supports multiple -c values, and renames output option to -r. |
| dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java | Implements multi-check -c parsing, help renaming, and output option rename. |

Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace/config/launcher.xml
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java

Paurikova2 commented Feb 25, 2026

| Problem (from issue) | Class + Method Changed | Test Name |
| --- | --- | --- |
| 1328 -i/--info renamed to -h/--help in health-report | HealthReport#setup, HealthReport#internalRun, HealthReport#printHelp | HealthReportIT#testHelpOption |
| 1328 -h/--help option declared for health-report CLI | HealthReportScriptConfiguration#getOptions | HealthReportIT#testHelpOption |
| 1328 -o/--output renamed to -r/--report; field fileName → reportFile | HealthReport#setup, HealthReport#internalRun | HealthReportIT#testReportFileSaved |
| 1328 -r/--report option declared in CLI options | HealthReportScriptConfiguration#getOptions | HealthReportIT#testReportFileSaved |
| 1328 -c accepts multiple values (List<Integer> replaces single int); uses getOptionValues | HealthReport#setup | HealthReportIT#testMultipleChecks, HealthReportIT#testLicenseCheck |
| 1328 -c declared as multi-value with setArgs(Integer.MAX_VALUE) | HealthReportScriptConfiguration#getOptions | HealthReportIT#testMultipleChecks |
| 1328 Validate -c: out-of-range index → logError + throw ParseException | HealthReport#setup | HealthReportIT#testInvalidCheckOutOfRange |
| 1328 Validate -c: non-integer value → logError + throw ParseException | HealthReport#setup | HealthReportIT#testInvalidCheckNonInteger |
| 1328 Validate -f: value ≤ 0 → logError + throw ParseException | HealthReport#setup | HealthReportIT#testInvalidForDaysZero |
| 1328 Validate -f: non-integer value → logError + throw ParseException | HealthReport#setup | HealthReportIT#testInvalidForDaysNonInteger |
| 1328 printCommandlineOptions updated to use getOptionValues for multi-value -c | HealthReport#printCommandlineOptions | HealthReportIT#testMultipleChecks |
| 1328 -i/--info renamed to -h/--help in report-diff | ReportDiff#setup, ReportDiff#internalRun, ReportDiff#printHelp | ReportDiffIT#testHelpInformation |
| 1328 -h/--help option declared for report-diff CLI | ReportDiffScriptConfiguration#getOptions | ReportDiffIT#testHelpInformation |
| 1328 Legacy healthcheck CLI command removed from launcher | dspace/config/launcher.xml | |
| 1334 Reports with different check lists produced null values: normalize both to intersection of check names | ReportDiff#normalizeReportsToIntersection | ReportDiffIT#testSkippedChecksSection, ReportDiffIT#testIntersectionComparisonNoNullsForCommonChecks |
| 1334 Checks absent from one report silently ignored: "Skipped Checks" section added | ReportDiff#generateReportComparison | ReportDiffIT#testSkippedChecksSection, ReportDiffIT#testIntersectionComparisonNoNullsForCommonChecks |
| 1334 -c in report-diff filtered by check name, not position, after normalization | ReportDiff#normalizeReportsToIntersection | ReportDiffIT#testCompareSpecificCheck |
| 1334 report-diff-fields.json paths used numeric indices (/checks/0/…) breaking resolution when check order changed: replaced with name selectors (/checks/[name=Item summary]/…) | report-diff-fields.json | ReportDiffIT#testEnhancedKeyChangesTable, ReportDiffIT#testProfessionalReportFormat, ReportDiffIT#testSizeDifferenceFormatting |
| 1334 resolveFieldPath did not support name-selector syntax: added XPath-like [attr=value] segment resolution | ReportDiff#resolveFieldPath, ReportDiff#splitPathSegments | ReportDiffIT#testEnhancedKeyChangesTable, ReportDiffIT#testProfessionalReportFormat, ReportDiffIT#testSizeDifferenceFormatting |
| 1334 Key Changes table showed null for fields when one report had fewer checks: skip fields missing in either normalized report | ReportDiff#generateEnhancedKeyChangesTable | ReportDiffIT#testIntersectionComparisonNoNullsForCommonChecks, ReportDiffIT#testEnhancedKeyChangesTable |
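
The -c validation rows above (out-of-range index, non-integer value) can be illustrated with a small standalone sketch. It is not the code from HealthReport#setup: the class CheckIndexParser is hypothetical, it throws IllegalArgumentException instead of the Commons CLI ParseException so that it compiles without dependencies, and it assumes valid indices run from 0 to numberOfChecks - 1.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the multi-value -c validation; not the PR's implementation. */
final class CheckIndexParser {

    /**
     * Parse raw -c values into check indices.
     * Assumption: valid indices are 0 .. numberOfChecks - 1.
     */
    static List<Integer> parse(String[] rawValues, int numberOfChecks) {
        List<Integer> indices = new ArrayList<>();
        for (String raw : rawValues) {
            int index;
            try {
                index = Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                // The PR logs an error and throws ParseException at this point.
                throw new IllegalArgumentException("Value of -c is not an integer: " + raw, e);
            }
            if (index < 0 || index >= numberOfChecks) {
                throw new IllegalArgumentException(
                        "Value of -c is out of range (0 to " + (numberOfChecks - 1) + "): " + raw);
            }
            indices.add(index);
        }
        return indices;
    }
}
```

Validating every value before any check runs means a single bad -c argument rejects the whole invocation instead of producing a partial report.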

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/test/java/org/dspace/scripts/HealthReportIT.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
@Paurikova2 Paurikova2 linked an issue Mar 5, 2026 that may be closed by this pull request

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java:385

  • displayReportDates() also assumes the newest report is at the end of findAll() (reverses index via allReports.size() - 1 - i). Since findAll() is unordered, the listing and max-entry truncation can be inconsistent. Prefer querying reports already ordered by lastModified/id DESC and applying the limit at the query level.
            List<ReportResult> allReports = reportResultService.findAll(context);
            // Determine how many reports to process, respecting maxEntries if it's within valid range
            long limitCount = (maxEntries > 0 && maxEntries < allReports.size()) ? maxEntries : allReports.size();
            Map<String, List<DateWithArgs>> reportDatesMap = new HashMap<>();
            for (long i = 0; i < limitCount; i++) {
                // the newest report is at the end of the list, so we reverse the index
                ReportResult report = allReports.get(allReports.size() - 1 - (int) i);
                String formattedDate = FORMATTER.format(report.getLastModified()

dspace-api/src/test/java/org/dspace/scripts/HealthReportIT.java:272

  • This test picks the "latest" ReportResult via findAll(context).get(size - 1), but ReportResultDAO inherits AbstractHibernateDAO.findAll() which has no ORDER BY. This can make the test flaky if the DB returns rows in a different order or other reports exist. Sort by lastModified/id explicitly (or add a dedicated query for the most recent report) before asserting on latest.getArgs().
                context.reloadEntity(eperson);
                List<ReportResult> allReports = reportResultService.findAll(context);
                ReportResult latest = allReports.get(allReports.size() - 1);

                assertThat(handler.getErrorMessages(), empty());
                assertThat(latest.getArgs(), containsString("-c: 2"));
                assertThat(latest.getArgs(), containsString("-c: 3"));
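
Both suppressed comments come down to not relying on the order returned by findAll(). A minimal in-memory sketch of explicit ordering follows; the Report class is a simplified, hypothetical stand-in for ReportResult, and sorting in memory is only an illustration, since the comment itself prefers ordering and limiting at the query level.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch: order reports newest-first before listing or picking the latest one. */
final class NewestFirst {

    /** Simplified stand-in for ReportResult with only the fields used for ordering. */
    static final class Report {
        final int id;
        final long lastModifiedMillis;

        Report(int id, long lastModifiedMillis) {
            this.id = id;
            this.lastModifiedMillis = lastModifiedMillis;
        }
    }

    /** Returns at most maxEntries reports, newest first; maxEntries <= 0 means no limit. */
    static List<Report> newest(List<Report> unordered, int maxEntries) {
        Comparator<Report> newestFirst = Comparator
                .comparingLong((Report r) -> r.lastModifiedMillis)
                .thenComparingInt(r -> r.id)
                .reversed();
        long limit = maxEntries > 0 ? maxEntries : unordered.size();
        return unordered.stream().sorted(newestFirst).limit(limit).collect(Collectors.toList());
    }
}
```

With this ordering, the latest report is simply the first element of the result, independent of the order in which the database returned the rows.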

Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/test/java/org/dspace/scripts/HealthReportIT.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java

coderabbitai Bot commented Jun 8, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 983764fc-55c6-499a-b580-0077024aa2d5

📥 Commits

Reviewing files that changed from the base of the PR and between 4dbc684 and 1b538e4.

📒 Files selected for processing (3)
  • dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java
  • dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
  • dspace-api/src/test/java/org/dspace/scripts/ReportDiffIT.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java
  • dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
  • dspace-api/src/test/java/org/dspace/scripts/ReportDiffIT.java

📝 Walkthrough

The PR modernizes the DSpace health-report and report-diff scripts. HealthReport now supports repeatable -c selections and -r output files, and persists ID-prefixed JSON reports. ReportDiff identifies reports by ID (-s/-t) and adds report listing, default IDs, intersection-based normalization, selector-based field resolution, and improved key-change formatting. Supporting DAOs, configuration, REST wiring, and tests were updated.

Changes

HealthReport CLI and Report Persistence

Layer / File(s) Summary
HealthReport CLI option definitions
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java
CLI option configuration updated to support -h/--help, repeatable -c/--check, -f/--for, -r/--report, and -e/--email using Option.builder.
HealthReport CLI parsing and validation
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java (lines 102–159)
Option parsing moved into setup(): early exit on -h, collect and validate multi-value -c against the number of checks, validate that -f is a positive integer, and read -r into reportFile.
HealthReport execution, filtering, and persistence
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java (lines 170–224)
Check loop filters by specificChecks, builds text and JSON payloads, persists ReportResult, constructs finalReport prefixed with stored ID, writes finalReport to reportFile, and sends finalReport via email with explicit exception logging.
HealthReport help and option display
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java (lines 237–287)
printHelp() and printCommandlineOptions() rewritten to document -h/-r and to deduplicate option keys and print all values for repeatable options.
HealthReport integration tests
dspace-api/src/test/java/org/dspace/scripts/HealthReportIT.java
Updated default header expectation and added tests for help, multiple -c usage, invalid -c/-f validation, report file saving, and storing args containing repeated -c values.

ReportDiff CLI Modernization and Comparison Logic

Layer / File(s) Summary
ReportDiff CLI option definitions
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java
Options redefined: -h/--help, -l/--list with -m/--max, repeatable -c/--check, -s/--source, -t/--target, and -e/--email.
ReportDiff argument parsing and state setup
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (lines 171–272)
setup()/internalRun() parse new options, branch on help/list, default missing IDs via latest-reports logic, validate report IDs, and route to compareReports.
ReportDiff report listing and display
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (lines 373–451)
displayReportDates() now lists "Available Reports Summary" with `ID
ReportDiff comparison setup and normalization
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (lines 586–711)
compareReports fetches reports by ID, normalizeReportsToIntersection computes common check-name intersection, optionally filters by -c, and returns normalized JSON plus skipped-check lists.
ReportDiff comparison output generation
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (lines 726–830)
generateReportComparison now includes source/target IDs in the executive summary, early-returns when no common checks (emitting "Skipped Checks"), otherwise builds enhanced key-changes table and detailed changelog, and short-circuits when no differences.
ReportDiff key changes table and formatting
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (lines 985–1215)
Added selector-based field resolution (bracket attribute selectors and array indices), byte-format helpers, revised difference calculation (Added/Removed/Changed), and ID-based table headers. Legacy JSON path helpers removed.
ReportDiff integration tests
dspace-api/src/test/java/org/dspace/scripts/ReportDiffIT.java
Help/listing tests adapted to -h/-l/-m; comparisons moved to -s/-t IDs; many fixtures and assertions updated for named check paths, size-delta formatting, skipped-checks, missing-field robustness, and CLI fallback/warning behaviors.
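
The byte-format helpers mentioned in the table above are not shown on this page. As a rough idea of what a human-readable size delta could look like, here is a hedged sketch: the class SizeDelta, the 1024-based units, and the one-decimal output format are all assumptions and may differ from what ReportDiff actually prints.

```java
import java.util.Locale;

/** Hypothetical sketch of human-readable size differences; units and format are assumptions. */
final class SizeDelta {

    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    /** Format a (possibly negative) byte count using 1024-based units. */
    static String humanReadable(long bytes) {
        double value = Math.abs((double) bytes);
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        String sign = bytes < 0 ? "-" : "";
        return unit == 0
                ? sign + (long) value + " B"
                : String.format(Locale.ROOT, "%s%.1f %s", sign, value, UNITS[unit]);
    }

    /** Signed difference between target and source sizes, e.g. for a key-changes table cell. */
    static String delta(long sourceBytes, long targetBytes) {
        long diff = targetBytes - sourceBytes;
        return (diff > 0 ? "+" : "") + humanReadable(diff);
    }
}
```

Formatting with Locale.ROOT keeps the decimal separator stable across environments, which matters when integration tests assert on the rendered text.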

ReportResult Service API Cleanup

Layer / File(s) Summary
Service and DAO interface cleanup
dspace-api/src/main/java/org/dspace/content/service/ReportResultService.java, dspace-api/src/main/java/org/dspace/content/dao/ReportResultDAO.java
Removed findByLastModifiedAndCheckType(Context, Date, int) from interfaces; findByLastModified(Context, Date) remains.
Service implementation cleanup
dspace-api/src/main/java/org/dspace/content/ReportResultServiceImpl.java, dspace-api/src/main/java/org/dspace/content/dao/impl/ReportResultDAOImpl.java
Removed corresponding implementations of the type-filtered lookup method.

Configuration and Integration Updates

Layer / File(s) Summary
Report diff field configuration
dspace-api/src/main/resources/report-diff-fields.json
fieldMappings and fieldOrder changed from numeric check indices (/checks/0) to named selector paths (/checks/[name=...]); directory size labels no longer include “(bytes)”.
REST script execution integration
dspace-server-webapp/src/main/java/org/dspace/app/rest/repository/ScriptRestRepository.java
runDSpaceScript captures initialize() return and early-exits when StepResult.Exit, invoking handler completion without scheduling long-running work.
Launcher configuration cleanup
dspace/config/launcher.xml
Removed legacy healthcheck command entry from launcher.
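
The move from /checks/0 to /checks/[name=...] paths can be illustrated with a small resolver over plain maps and lists. This is a sketch under assumptions, not ReportDiff#resolveFieldPath: the class SelectorPath is hypothetical, the report is modelled with java.util collections instead of a JSON library, and selector values are assumed not to contain '/'.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of selector-based path resolution; not the PR's ReportDiff code. */
final class SelectorPath {

    /** Returns the value at 'path', or null when any segment cannot be resolved. */
    static Object resolve(Object root, String path) {
        Object current = root;
        // Simplification: splitting on '/' means selector values must not contain '/'.
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // empty segment produced by the leading slash
            }
            if (current instanceof Map) {
                current = ((Map<?, ?>) current).get(segment);
            } else if (current instanceof List) {
                current = selectFromList((List<?>) current, segment);
            } else {
                return null;
            }
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** Resolves either an [attr=value] selector or a numeric index against a list. */
    private static Object selectFromList(List<?> list, String segment) {
        if (segment.startsWith("[") && segment.endsWith("]")) {
            String[] selector = segment.substring(1, segment.length() - 1).split("=", 2);
            if (selector.length != 2) {
                return null;
            }
            for (Object element : list) {
                if (element instanceof Map) {
                    Object attribute = ((Map<?, ?>) element).get(selector[0]);
                    if (attribute != null && selector[1].equals(attribute.toString())) {
                        return element;
                    }
                }
            }
            return null;
        }
        try {
            int index = Integer.parseInt(segment);
            return index >= 0 && index < list.size() ? list.get(index) : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Because the selector matches on the check name, the same path resolves to the same check regardless of where it sits in the checks array.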

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • dataquest-dev/DSpace#1040: Related — touches report persistence and ReportResult DAO/service contracts that intersect with removed/changed lookup methods.
  • dataquest-dev/DSpace#986: Related — both change health-report JSON generation/consumption and structured check payload handling.
  • dataquest-dev/DSpace#863: Related — changes to configured checks (names/ordering) that affect which checks HealthReport loads and runs.

Suggested labels

REVIEW-done

Suggested reviewers

  • milanmajchrak
🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is largely incomplete and off-topic; it provides only issue references and a checklist link without explaining the actual problem, analysis, or the substantial implementation changes (CLI restructuring, API removals, field path updates). | Expand the description to include a 'Problem description' section explaining the issues being addressed, 'Analysis' covering the major changes (health-report CLI, report-diff ID-based comparison, API removals), and details on manual testing performed. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 51.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title is vague and generic; it uses non-descriptive terms like 'fixes' that don't clearly convey the main scope or intent of the substantial changeset involving health-report CLI rewrites, report-diff ID-based comparison, and API removals. | Provide a more specific title that highlights the primary changes, such as 'Refactor health-report and report-diff CLI to use IDs and support multi-check filtering' or similar. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (1)

342-386: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Line length exceeds 120 characters in multiple places.

Lines 369-370 and 377-378 exceed the 120-character maximum. Consider breaking these into separate lines or extracting the message strings.

📏 Suggested reformat for lines 369-370
             if (Objects.isNull(sourceReportId)) {
-                handler.logInfo("Only '-t' was specified; '-s' will be set to the latest report from the "
-                        + "database.");
+                handler.logInfo("Only '-t' was specified; '-s' will be set to the latest report "
+                        + "from the database.");

As per coding guidelines, maintain maximum 120 character line length in Java code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java` around
lines 342 - 386, In defaultReportIds, several handler.logInfo calls (the
messages for "Only '-t' was specified; '-s' will be set..." and "Only '-s' was
specified; '-t' will be set...") exceed the 120-char limit; shorten them by
splitting the long literal into two concatenated string literals or assign the
message to a local String variable and break it across lines (or use
String.format) before passing it to handler.logInfo so each source line stays
under 120 characters while keeping the same message semantics and using the
existing symbols sourceReportId, targetReportId, handler.logInfo, and
defaultReportIds.

Source: Coding guidelines

♻️ Duplicate comments (2)
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (1)

519-526: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update Javadoc to reflect ID-based comparison.

The Javadoc at line 520 still refers to "specified from and to dates," but the method now uses sourceReportId and targetReportId (IDs, not dates). Update the description to match the current implementation.

📝 Suggested Javadoc update
     /**
-     * Compare two reports based on the specified `from` and `to` dates.
+     * Compare two reports based on the specified source and target report IDs.
      * If the reports are not found, log an appropriate message.
      * If the reports are found, generate a comparison report showing the differences.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java` around
lines 519 - 526, Update the Javadoc on the compareReports method in class
ReportDiff to reflect that comparison is performed using report IDs rather than
date ranges: change the description to say it compares two reports identified by
sourceReportId and targetReportId, explains behavior when reports are missing
and that comparison uses the intersection of check names present in both
reports, and update the `@param` tags to describe sourceReportId and
targetReportId (IDs of the reports) and keep the existing `@param` context
description unchanged.
dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java (1)

42-46: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Line length exceeds 120 characters.

Lines 43-44 exceed the 120-character maximum specified in the coding guidelines. Consider breaking the description string into multiple concatenated parts or reformatting the builder chain.

📏 Suggested reformat
-            Option checkOption = Option.builder("c").longOpt("check").hasArgs()
-                    .desc(String.format("Filter comparison to one or more specific checks by index (0 to %d). " +
-                            "Repeat the flag (e.g. -c 1 -c 3) to compare multiple checks from both reports.",
-                            HealthReport.getNumberOfChecks() - 1))
-                    .type(String.class)
-                    .build();
+            Option checkOption = Option.builder("c")
+                    .longOpt("check")
+                    .hasArgs()
+                    .desc(String.format(
+                            "Filter comparison to one or more specific checks by index (0 to %d). "
+                                    + "Repeat the flag (e.g. -c 1 -c 3) to compare multiple checks.",
+                            HealthReport.getNumberOfChecks() - 1))
+                    .type(String.class)
+                    .build();

As per coding guidelines, maintain maximum 120 character line length in Java code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java`
around lines 42 - 46, The description string for the Option checkOption (created
via Option.builder("c").longOpt("check").hasArgs()) exceeds 120 characters;
split the desc(...) argument into multiple concatenated string literals or
separate method call segments so no source line is longer than 120 chars (e.g.,
break after a sentence or before the String.format call parameters), keeping the
same call chain and still using HealthReport.getNumberOfChecks() - 1 to compute
the upper index.

Source: Coding guidelines

🧹 Nitpick comments (5)
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java (3)

156-248: ⚡ Quick win

Missing Javadoc for public method.

The internalRun() method is public (inherited as public from DSpaceRunnable) but lacks Javadoc documentation. As per coding guidelines, all public methods in Java files must have Javadoc comments.

📝 Add Javadoc
+    /**
+     * Execute the health report generation.
+     * Runs selected checks, persists results, writes output file (if requested), and sends email (if requested).
+     * @throws Exception if an error occurs during report generation
+     */
     @Override
     public void internalRun() throws Exception {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java`
around lines 156 - 248, Add a Javadoc comment for the public method
internalRun() in class HealthReport: document purpose, thrown exceptions,
important parameters/state (e.g., use of Context, reportFile, emails) and
side-effects (persists ReportResult, writes file, sends email), and include
`@throws` Exception to match the signature; place the comment immediately above
the internalRun() method declaration so it satisfies the project's public-method
Javadoc requirement.

Source: Coding guidelines


250-265: ⚡ Quick win

Missing Javadoc for public method.

The printHelp() method is public (inherited as public from DSpaceRunnable) but lacks Javadoc documentation. As per coding guidelines, all public methods in Java files must have Javadoc comments.

📝 Add Javadoc
+    /**
+     * Print help information for the health report script.
+     * Displays available options, defaults, and the list of available checks.
+     */
     @Override
     public void printHelp() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java`
around lines 250 - 265, Add a Javadoc comment for the public method printHelp()
in class HealthReport: describe that it prints help/usage information for the
health report CLI, mention it overrides DSpaceRunnable.printHelp(), note that it
reads the configured "healthcheck.last_n_days" and lists available checks, and
include `@Override` and an `@since` or `@see` tag if project conventions require it;
place the Javadoc immediately above the public void printHelp() declaration so
the method is properly documented.

Source: Coding guidelines


271-292: ⚡ Quick win

Missing Javadoc for private helper method.

While the coding guideline focuses on public methods, adding Javadoc to printCommandlineOptions() would improve maintainability, especially given its role in persisting report arguments.

📝 Add Javadoc
+    /**
+     * Format command line options as a readable string.
+     * Deduplicates repeated option keys and includes all values for multi-value options.
+     * This method is used to record the options used when generating the report.
+     * @return formatted string of command line options
+     */
     private String printCommandlineOptions() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java`
around lines 271 - 292, Add a Javadoc comment to the private helper method
printCommandlineOptions() describing its purpose (formatting and returning the
currently parsed command-line options for inclusion in the health report), its
behavior (iterates over CommandLine options, handles multi-valued options and
de-duplicates by option key), the fact that it has no parameters, and what it
returns (a formatted String of options); attach this Javadoc directly above the
printCommandlineOptions() method signature to improve maintainability and
clarity.
dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java (1)

33-58: ⚡ Quick win

Missing Javadoc for public method.

The getOptions() method is public but lacks Javadoc documentation. As per coding guidelines, all public methods in Java files must have Javadoc comments.

📝 Add Javadoc
+    /**
+     * Get the CLI options for the health report script.
+     * @return Options configured options for the script
+     */
     @Override
     public Options getOptions() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java`
around lines 33 - 58, Add a Javadoc block above the public method getOptions()
in class HealthReportScriptConfiguration describing its purpose (builds and
returns CLI Options for the health report script), the fact it caches the
Options instance, the return type (Options), and any behavior notes (which flags
are added, e.g. -h/--help, -e/--email, -c/--check, -f/--for, -r/--report);
include `@return` describing the returned Options and `@since`/`@author` tags if your
project requires them.

Source: Coding guidelines

dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java (1)

317-322: 💤 Low value

Dead code: XOR validation is unreachable after defaultReportIds().

The check at lines 318-322 validates that both sourceReportId and targetReportId are set (neither is null while the other isn't). However, this method is only called after defaultReportIds() has already executed (line 256), which fills any missing ID. After defaultReportIds(), either both IDs are set or both remain null (if insufficient reports exist in the DB), so the XOR condition will never be true.

Consider removing lines 318-322 or moving this validation earlier in the flow (before defaultReportIds()) if you want to enforce that users must supply both IDs explicitly.

♻️ Option 1: Remove dead XOR check
     private boolean validateReportIdSelection() {
-        if ((Objects.isNull(sourceReportId) && Objects.nonNull(targetReportId))
-                || (Objects.nonNull(sourceReportId) && Objects.isNull(targetReportId))) {
-            handler.logError("Both 'source' and 'target' report IDs must be specified.");
-            return false;
-        }
-
         if (Objects.nonNull(sourceReportId) && sourceReportId <= 0) {
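The alternative mentioned above (running the check before the defaults are filled in) could look roughly like the sketch below. It is only an illustration under assumptions: the class name, the method names, the `previousId`/`latestId` parameters and the two "positive number" messages are hypothetical and not taken from the PR; only the XOR error message comes from the diff above. It also assumes the intent is to require both IDs whenever one is given.

```java
import java.util.Optional;

public class ReportIdOrderingSketch {

    /**
     * Validates the raw user input. Returns an error message, or empty if the input is acceptable.
     * Because this runs before any defaulting, the "exactly one ID given" branch is reachable.
     */
    static Optional<String> validateUserInput(Long sourceReportId, Long targetReportId) {
        boolean onlyOneGiven = (sourceReportId == null) != (targetReportId == null);
        if (onlyOneGiven) {
            return Optional.of("Both 'source' and 'target' report IDs must be specified.");
        }
        if (sourceReportId != null && sourceReportId <= 0) {
            return Optional.of("Source report ID must be a positive number."); // hypothetical message
        }
        if (targetReportId != null && targetReportId <= 0) {
            return Optional.of("Target report ID must be a positive number."); // hypothetical message
        }
        return Optional.empty();
    }

    /**
     * Validates first, then fills missing IDs. previousId and latestId stand in for
     * whatever the two most recent stored reports would be (hypothetical parameters).
     */
    static long[] resolve(Long sourceReportId, Long targetReportId, long previousId, long latestId) {
        Optional<String> error = validateUserInput(sourceReportId, targetReportId);
        if (error.isPresent()) {
            throw new IllegalArgumentException(error.get());
        }
        long source = sourceReportId != null ? sourceReportId : previousId;
        long target = targetReportId != null ? targetReportId : latestId;
        return new long[] {source, target};
    }
}
```

With this ordering, passing only one ID is rejected before any default can mask it, while passing neither still falls back to the two defaults.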
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java` around
lines 317 - 322, The XOR null-check in validateReportIdSelection() (the
conditional that logs via handler.logError when exactly one of
sourceReportId/targetReportId is null) is dead after defaultReportIds()
populates missing IDs; either remove that XOR branch from
validateReportIdSelection() or call validateReportIdSelection() before
defaultReportIds() so the XOR check can catch user-supplied-only cases. Update
the call site around defaultReportIds()/validateReportIdSelection() accordingly
and keep the handler.logError usage (or remove it if you choose deletion) to
ensure behavior remains consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java`:
- Around line 342-386: In defaultReportIds, several handler.logInfo calls (the
messages for "Only '-t' was specified; '-s' will be set..." and "Only '-s' was
specified; '-t' will be set...") exceed the 120-char limit; shorten them by
splitting the long literal into two concatenated string literals or assign the
message to a local String variable and break it across lines (or use
String.format) before passing it to handler.logInfo so each source line stays
under 120 characters while keeping the same message semantics and using the
existing symbols sourceReportId, targetReportId, handler.logInfo, and
defaultReportIds.

---

Duplicate comments:
In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java`:
- Around line 519-526: Update the Javadoc on the compareReports method in class
ReportDiff to reflect that comparison is performed using report IDs rather than
date ranges: change the description to say it compares two reports identified by
sourceReportId and targetReportId, explains behavior when reports are missing
and that comparison uses the intersection of check names present in both
reports, and update the `@param` tags to describe sourceReportId and
targetReportId (IDs of the reports) and keep the existing `@param` context
description unchanged.
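As a rough illustration of "the intersection of check names present in both reports", the sketch below computes the compared and the skipped check names from two reports. It assumes each report can be viewed as a map from check name to check output; that representation, the class name and the method names are hypothetical and not the PR's actual data model.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class CheckIntersectionSketch {

    /** Check names present in both reports; only these would be compared. */
    static Set<String> commonChecks(Map<String, String> source, Map<String, String> target) {
        Set<String> common = new LinkedHashSet<>(source.keySet());
        common.retainAll(target.keySet());
        return common;
    }

    /** Check names present in exactly one report; these would be listed as skipped. */
    static Set<String> skippedChecks(Map<String, String> source, Map<String, String> target) {
        Set<String> skipped = new LinkedHashSet<>(source.keySet());
        skipped.addAll(target.keySet());
        skipped.removeAll(commonChecks(source, target));
        return skipped;
    }
}
```

Copying the key sets before calling `retainAll`/`removeAll` keeps the input maps untouched.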

In
`@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java`:
- Around line 42-46: The description string for the Option checkOption (created
via Option.builder("c").longOpt("check").hasArgs()) exceeds 120 characters;
split the desc(...) argument into multiple concatenated string literals or
separate method call segments so no source line is longer than 120 chars (e.g.,
break after a sentence or before the String.format call parameters), keeping the
same call chain and still using HealthReport.getNumberOfChecks() - 1 to compute
the upper index.

---

Nitpick comments:
In `@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java`:
- Around line 156-248: Add a Javadoc comment for the public method internalRun()
in class HealthReport: document purpose, thrown exceptions, important
parameters/state (e.g., use of Context, reportFile, emails) and side-effects
(persists ReportResult, writes file, sends email), and include `@throws` Exception
to match the signature; place the comment immediately above the internalRun()
method declaration so it satisfies the project's public-method Javadoc
requirement.
- Around line 250-265: Add a Javadoc comment for the public method printHelp()
in class HealthReport: describe that it prints help/usage information for the
health report CLI, mention it overrides DSpaceRunnable.printHelp(), note that it
reads the configured "healthcheck.last_n_days" and lists available checks, and
include `@Override` and an `@since` or `@see` tag if project conventions require it;
place the Javadoc immediately above the public void printHelp() declaration so
the method is properly documented.
- Around line 271-292: Add a Javadoc comment to the private helper method
printCommandlineOptions() describing its purpose (formatting and returning the
currently parsed command-line options for inclusion in the health report), its
behavior (iterates over CommandLine options, handles multi-valued options and
de-duplicates by option key), the fact that it has no parameters, and what it
returns (a formatted String of options); attach this Javadoc directly above the
printCommandlineOptions() method signature to improve maintainability and
clarity.
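The behavior described for printCommandlineOptions() (iterate over the parsed options, join multi-valued options, de-duplicate by option key) might be sketched as follows. This is a standalone illustration using only the JDK: `ParsedOption` is a hypothetical stand-in for a parsed CLI option, and the sketch assumes a repeated option appears more than once in the parsed list with its full value list each time.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OptionSummarySketch {

    /** Hypothetical stand-in for a parsed command-line option. */
    public static final class ParsedOption {
        final String key;
        final List<String> values;

        public ParsedOption(String key, List<String> values) {
            this.key = key;
            this.values = values;
        }
    }

    /** Formats the parsed options as one line, keeping the first occurrence of each key. */
    static String format(List<ParsedOption> options) {
        Map<String, List<String>> byKey = new LinkedHashMap<>();
        for (ParsedOption option : options) {
            // putIfAbsent keeps the first entry, so a repeated key is printed only once
            byKey.putIfAbsent(option.key, option.values);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> entry : byKey.entrySet()) {
            sb.append(" -").append(entry.getKey());
            if (!entry.getValue().isEmpty()) {
                sb.append(' ').append(String.join(" ", entry.getValue()));
            }
        }
        return sb.toString().trim();
    }
}
```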

In
`@dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java`:
- Around line 33-58: Add a Javadoc block above the public method getOptions() in
class HealthReportScriptConfiguration describing its purpose (builds and returns
CLI Options for the health report script), the fact it caches the Options
instance, the return type (Options), and any behavior notes (which flags are
added, e.g. -h/--help, -e/--email, -c/--check, -f/--for, -r/--report); include
`@return` describing the returned Options and `@since`/`@author` tags if your project
requires them.

In `@dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java`:
- Around line 317-322: The XOR null-check in validateReportIdSelection() (the
conditional that logs via handler.logError when exactly one of
sourceReportId/targetReportId is null) is dead after defaultReportIds()
populates missing IDs; either remove that XOR branch from
validateReportIdSelection() or call validateReportIdSelection() before
defaultReportIds() so the XOR check can catch user-supplied-only cases. Update
the call site around defaultReportIds()/validateReportIdSelection() accordingly
and keep the handler.logError usage (or remove it if you choose deletion) to
ensure behavior remains consistent.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 2f03408 and 94e53d2.

📒 Files selected for processing (13)
  • dspace-api/src/main/java/org/dspace/app/healthreport/HealthReport.java
  • dspace-api/src/main/java/org/dspace/app/healthreport/HealthReportScriptConfiguration.java
  • dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java
  • dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiffScriptConfiguration.java
  • dspace-api/src/main/java/org/dspace/content/ReportResultServiceImpl.java
  • dspace-api/src/main/java/org/dspace/content/dao/ReportResultDAO.java
  • dspace-api/src/main/java/org/dspace/content/dao/impl/ReportResultDAOImpl.java
  • dspace-api/src/main/java/org/dspace/content/service/ReportResultService.java
  • dspace-api/src/main/resources/report-diff-fields.json
  • dspace-api/src/test/java/org/dspace/scripts/HealthReportIT.java
  • dspace-api/src/test/java/org/dspace/scripts/ReportDiffIT.java
  • dspace-server-webapp/src/main/java/org/dspace/app/rest/repository/ScriptRestRepository.java
  • dspace/config/launcher.xml
💤 Files with no reviewable changes (4)
  • dspace-api/src/main/java/org/dspace/content/ReportResultServiceImpl.java
  • dspace-api/src/main/java/org/dspace/content/service/ReportResultService.java
  • dspace-api/src/main/java/org/dspace/content/dao/ReportResultDAO.java
  • dspace-api/src/main/java/org/dspace/content/dao/impl/ReportResultDAOImpl.java

Comment thread dspace-api/src/main/java/org/dspace/app/reportdiff/ReportDiff.java Outdated
try (Context context = new Context()) {
defaultDate(context);
// If at least one of -s/-t is missing, fill missing values from latest reports.
if (Objects.isNull(sourceReportId) || Objects.isNull(targetReportId)) {

@kuchtiak-ufal kuchtiak-ufal Jun 11, 2026

Collaborator


Minor comment:
Maybe a shorter version:

if (sourceReportId == null || targetReportId == null) {
...
}

…ecks section, validate report IDs before defaulting

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@milanmajchrak milanmajchrak merged commit 3dc9cec into dtq-dev Jun 12, 2026
11 checks passed
milanmajchrak added a commit that referenced this pull request Jul 2, 2026
* UFAL/Fixed failing integration test (ufal#1332) (#1249)

* Add debug messages to failing test

(cherry picked from commit 4cc3694)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* [Port to dtq-dev] Fix OpenAIRE integration: null handling and HTTP client lifecycle (#1248)

* Fix OpenAIRE integration: null handling and HTTP client lifecycle (ufal#1330)

* Add test for OpenAIRE connector

* Initial plan

* Add null check for OpenAIRE response to prevent NullPointerException

Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>

* Fix HTTP client lifecycle to prevent premature connection closure

Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>

* Keep the try with resources but copy the response

into an in memory stream and return that

* license:check

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>
(cherry picked from commit 02984db)

* Handle NumberFormatException in OpenAIREFundingDataProvider.getNumberOfResults and use explicit UTF-8 charset in OpenAIRERestConnectorTest

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>
Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk>

* UFAL/Added a comment as a reminder to mount the file that is changed via the configuration feature (#1247)

* UFAL/Issue 1315: Store file preview to database when file preview is created on Item Page load. (ufal#1316) (#1241)

* Issue ufal/clarin-dspace1315: Store file preview to database when file preview is created on item page load

* assert text improvement

* PR comments: commit context only when any of the file preview is successfully created

* change variable name

(cherry picked from commit aab626b)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/Issue 1313: fixed error when file preview is not generated for bitstream with store_number = 77 (ufal#1318) (#1240)

* Issue ufal#1313: fixed error when file preview is not generated for bitstream with store number = 77

* resolve MR comments

(cherry picked from commit 04d64f7)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/New version metadata issues (#1236)

* Issue ufal#1266: dc.date.available and dc.relation.replaces metadata not cleared properly (ufal#1307)

* Issue ufal#1266: dc.date.available and dc.relation.replaces metadata not cleaned properly in new item version

* resolve MR comments - update ignoredMetadataFields in versioning-service.xml

* update ClarinVersionedHandleIdentifierProviderIT test to check dc.identifier.uri metadata for new version

(cherry picked from commit 7ffaf9a)

* Issue 1319: do not copy dc.identifier.doi metadata when new item version is created

(cherry picked from commit 1b7ed17)

---------

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/Fix: add bitstream download-by-handle endpoint for curl instructions (#1252)

* fix: add bitstream download-by-handle endpoint for curl instructions

Adds GET /api/core/bitstreams/handle/{prefix}/{suffix}/{filename} endpoint
that directly serves bitstream content by item handle and filename.

This resolves the issue where curl download instructions generated by the
UI produced URLs pointing to non-existent backend endpoints, resulting in
404 errors when users attempted to download files via command line.

The new endpoint resolves the handle to an Item, finds the bitstream by
exact filename in ORIGINAL bundles, and streams the raw content with
correct Content-Type and Content-Disposition headers.

Refs: dataquest-dev/dspace-angular#1210

* Fixed compiling errors

* Small refactoring - use constants and removed unnecessary changes

* added comments, return 404 status instead of 402

* unauthorized instead of forbidden

* fix: use RFC 5987 Content-Disposition for non-ASCII filenames

curl -J on Windows cannot create files with non-ASCII characters (e.g.
diacritics like e/a) from a raw UTF-8 Content-Disposition filename header.

Uses filename*=UTF-8''percent-encoded-name (RFC 5987/6266) which curl
properly decodes. Also includes an ASCII fallback in filename param.

* fix: move context.complete() after streaming to prevent truncated downloads

context.complete() was called before bitstreamService.retrieve(), closing
the DB connection and causing 'end of response with X bytes missing' errors.
Now context.complete() is called only after the full content has been streamed.
For S3 redirect and HEAD paths, context.complete() remains before return
since no streaming is needed.

* fix: use real UTF-8 filename in Content-Disposition instead of ASCII fallback

The filename parameter now contains the original name (with diacritics like
e/a) instead of replacing non-ASCII chars with underscores. Characters in
the ISO-8859-1 range are transmitted correctly by Tomcat and understood by
curl on Western/Central-European systems. The filename* parameter still
provides RFC 5987 percent-encoded UTF-8 for modern clients (curl 7.56+).

* fix: revert to ASCII fallback in Content-Disposition, add edge-case tests

Content-Disposition filename parameter now uses ASCII fallback (non-ASCII
replaced with underscore) per RFC 6266. Modern clients use filename* (RFC
5987) which has the full UTF-8 name. The curl command no longer relies on
Content-Disposition at all (uses -o instead of -OJ).
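A header of the shape described above (ASCII fallback in `filename`, percent-encoded UTF-8 in `filename*`) could be built roughly as in the following sketch. It is not the endpoint's actual code: the class and method names are hypothetical, and the `attachment` disposition type is an assumption, since the commit message does not state it.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ContentDispositionSketch {

    /** Builds a Content-Disposition value with an ASCII fallback and an RFC 5987 filename*. */
    static String build(String filename) {
        // ASCII fallback: escape quotes and backslashes, replace anything outside
        // printable ASCII with an underscore (one underscore per UTF-16 code unit).
        StringBuilder fallback = new StringBuilder();
        for (char c : filename.toCharArray()) {
            if (c == '"' || c == '\\') {
                fallback.append('\\').append(c);
            } else if (c >= 0x20 && c < 0x7F) {
                fallback.append(c);
            } else {
                fallback.append('_');
            }
        }
        // URLEncoder performs form encoding: it turns a space into '+' and leaves '*' as-is.
        // A literal '+' in the name is already encoded as %2B, so the replacements are safe.
        String encoded = URLEncoder.encode(filename, StandardCharsets.UTF_8)
                .replace("+", "%20")
                .replace("*", "%2A");
        return "attachment; filename=\"" + fallback + "\"; filename*=UTF-8''" + encoded;
    }
}
```

Clients that understand `filename*` would pick the UTF-8 name, while older clients fall back to the ASCII `filename` parameter.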

New integration tests for edge cases:
- Multiple dots in filename (archive.v2.1.tar.gz)
- Double quotes in filename (escaped in Content-Disposition)
- CJK characters (beyond ISO-8859-1)
- Same filename in ORIGINAL and TEXT bundles (only ORIGINAL served)

* fix: resolve compilation errors and fix IT test assertions

- Remove duplicate HttpStatus import (apache vs spring)
- Add missing MediaType import (spring)
- Fix Content-Type assertion to include charset=UTF-8
- Use URI.create() for pre-encoded URLs in tests to prevent
  double-encoding (%25) rejection by StrictHttpFirewall

All 15 integration tests pass.

* test: add complex filename test (diacritics, plus, hash, unmatched paren)

New IT test for filename 'Media (+)#9) ano' verifying correct URL decoding,
Content-Disposition encoding, and content delivery. 16/16 tests pass.

* fix authorization, comments, tests

* fix: change expected status from 401 to 403 for authenticated non-admin user

The test downloadBitstreamByHandleUnauthorizedForNonAdmin uses getClient(token)
which means the user IS authenticated. The controller correctly returns 403
(Forbidden) for authenticated users without access, not 401 (Unauthorized).
401 is only for anonymous/unauthenticated requests.

---------

Co-authored-by: Paurikova2 <michaela.paurikova@dataquest.sk>

* Reduce warn logs noise (#1268)

* Log 404 responses at DEBUG instead of WARN to reduce log noise

* Log 404 responses at DEBUG instead of WARN (configurable via logging.server.debug-404)

* Skip stack trace extraction for suppressed 404 debug logs

* Replace custom debug-404 property with dedicated Log4j2 logger (org.dspace.app.rest.NotFound)

* Suppress 404 warn logs via dedicated Log4j2 logger (org.dspace.app.rest.NotFound)

* Turn off that warn logs for the dspace.log

* Updated log name to be more unique

* The row length was updated to be less than 120 chars (#1274)

* Reduce noisy WARN logs to DEBUG level (#1269)

Changed two frequently occurring WARN log messages to DEBUG level:
- Context.java: 'Initializing a context while an active transaction exists'
- ClarinItemServiceImpl.java: 'Cannot update item dates metadata because the approximate date is empty'

* Added oai bundle exclude feature

* Updated docs

* Fix OAI bundle exclusion docs and isolate OAIPMHBundleExposureIT config state

* Fix indentation in OAIPMHBundleExposureIT field declaration

* Updated docs

* fix failing Curation tests (ufal#1353) (#1304)

* fix RequiredMetadataIT failure

* different fix for failing curator tests

* change response to see last bitstream format results

* cleaning custom bitstream format creation in PreviewContentServiceImplIT test

* add debug messages

* IIIFCacheEventConsumer: don't consume events when event subject is null

(cherry picked from commit 50db8cd)

**NOTE**: This is without the `dspace-api/src/test/java/org/dspace/curate/ItemMetadataQACheckerIT.java` change
will add that one into #1237

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/Remove 'clariah' submission process (#1305)

Removed the 'clariah' submission process because it is not used

(cherry picked from commit cae96d5)

* UFAL/Issue 1349: admin user is not allowed to delete himself/herself (ufal#1350) (#1306)

* Issue 1349: admin user is not allowed to delete himself/herself

* improve the fix: test context.getCurrentUser() for null

* throw IllegalStateException rather than AuthorizeException, and allow client to see the error message

(cherry picked from commit b913627)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* Issue 1354: add dc.relation.isreplacedby only when item is installed (ufal#1356) (#1308)

* Issue 1354: add dc.relation.isreplacedby only when item is installed

* set dc.relation.replaces on new item creation

* fixed JavaDoc

* fixed failing tests

* test if dc.relation.isreplacedby metadata are only added when new item version is installed

* use Context#reloadEntity rather than calling find method

* removing unused field

* also removing the now unused import

---------


(cherry picked from commit 3e204e5)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/Issue 1339: fixed NPE when hidden item metadata are checked for the item with deleted submitter (ufal#1344) (#1290)

* Issue 1339: fixed NPE when hidden item metadata are checked for the item with deleted submitter

(cherry picked from commit 64fda50)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* Issue 1321: disable File preview for files where user has no Bitstream READ permission (ufal#1327) (#1280)

* Issue 1321: disable File preview for files where the user has no Bitstream READ permission

* allow file preview in case only the License agreement is needed

* don't allow to create file preview for non-authorized user, nor for item that requires license confirmation

* fixed failing FilePreviewIT test. Now only the user with file READ permission can generate file preview

* add more tests for HTML file preview

* add test for HTML File preview

* improve warning messages

* extend test to see if non admin user can see already generated file preview

---------


(cherry picked from commit 50d7bbc)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/issue 1324: curation task - implement 3 types of reporters to allow proper report writing to selected destination (ufal#1326) (#1264)

* issue 1324: implement 3 types of reporters to allow proper writing to file or to console

* don't throw exception when not needed

* no AbstractUnitTest required - AbstractDSpaceTest is sufficient

* improve append(str) method in reporters, by using StringUtils.chomp() method

(cherry picked from commit 6f19d1f)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* Issue ufal#1292 link/unlink items with version relationship (ufal#1304) (#1253)

* Issue 1292: script to allow link two items into version relationship

* implement link and unlink actions

* ItemVersionLinkerIT test

* improve ItemVersionLinkerIT, fix ScriptRestRepositoryIT

* improve test to be more realistic

* improve options description

* better call of itemService.clearMetadata()

* clear correctly dc.relation.replaces and dc.relation.isreplacedby

* use dc.identifier.uri metadata value rather than item.getHandle() to set dc.relation.replaces and dc.relation.isreplacedby

* code-cleanup

* add also as a cli script

* Use Item.ANY instead of null

tested with the production db dump on the items mentioned in the issue
(11234/1-5537). It was returning:
```
The script has started
Item '11234/1-5537' has no handle assigned.
```

because it's dc.identifier.uri.*

---------



(cherry picked from commit 903b35a)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/[Port to dtq-dev] Port ItemMetadataQAChecker curation task from v5 to v7 (#1237)

* Port ItemMetadataQAChecker curation task from v5 to v7 (ufal#1312)

* Add ItemMetadataQAChecker curation task with tests

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>
(cherry picked from commit d5517be)

* Issue ufal#1310 curation task to check relation metadata (ufal#1325)

* issue 1310: check versioning relationship for items with relation metadata

* improve implementation + test

* More readable, I think.

* update logging

add the handle of the referenced item where possible

* a test case to cover "no related item"

* improve the failure message

---------

Co-authored-by: Ondřej Košarko <ko_ok@centrum.cz>
(cherry picked from commit c97f406)

* fix failing Curation tests (ufal#1353)

* fix RequiredMetadataIT failure

* different fix for failing curator tests

* change response to see last bitstream format results

* cleaning custom bitstream format creation in PreviewContentServiceImplIT test

* resolve MR comments

* add debug messages

* more debug messages

* IIIFCacheEventConsumer: don't consume events when event subject is null

(cherry picked from commit 50db8cd)

**NOTE**: this is just `dspace-api/src/test/java/org/dspace/curate/ItemMetadataQACheckerIT.java` the rest is in #1304

* removed empty line

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>
Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/Separate CLARIN license payload from sections.license (#1319)

* fix(submission): separate CLARIN license payload from sections.license

* fix(clarin-license): rename step path to /select, rename DataClarinLicense to ClarinDataLicense, and clean up comments

* fix(clarin-license): align DTO name with Rest suffix convention

* fix(submission): align CLARIN section DTO naming and apply Copilot review fixes

* Align CLARIN license patch semantics and tighten section path handling

* Fix checkstyle issue

* Stabilize unknown CLARIN license metadata assertion

* Added doc and checked null value

* Harden CLARIN license patch handling and logging

* UFAL/Fix clarin-license IT: expect 422 for empty value on select patch (#1323)

* Expect 422 for empty value on clarin-license select patch

* Treat blank value as clear on clarin-license section select

* UFAL/Fix refbox buttons (ufal#1367) (#1318)

* adding test and fixing the issue

fixes ufal#1366 there was a conflict between the
produces=application/json and response.setContentType("application/xml")

* Strengthen citations endpoint test assertions for ufal#1366 regression coverage

Agent-Logs-Url: https://github.com/ufal/clarin-dspace/sessions/e385ef9e-8186-4899-b811-fc82bbfa942b



---------



(cherry picked from commit 8400982)

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>

* Health report, report diff fixes (#1254)

* fix(health-report): fix CLI args, multi-check support, and report-diff comparison logic

* Fix ReportDiff setup and date validation

* fix failed integration test

* added tests, used -c 1 2 instead of -c 1 -c 2

* used UNLIMITED_VALUES instead of MAX_VALUE

* improved doc

* used multilist for -c , removed unused method

* improved doc

* WIP updated health report and report diff

* Complete update of health-report and report-diff

* Sorting reports and updating tests, plus enable multiple -c in report-diff

* Improved docs, output and info and logic

* Improved comparison of reports w/o changes, or with only one report specified

* Updated tests

* Removed unused Report and refactor getChecks

* Address review comments: simplify null checks, deduplicate skipped-checks section, validate report IDs before defaulting

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Matus Kasak <matus.kasak@dataquest.sk>
Co-authored-by: Paurikova2 <michaela.paurikova@dataquest.sk>
Co-authored-by: milanmajchrak <milan.majchrak@dataquest.sk>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

* Fix flaky tests in IT pipeline (#1321)

* Fixed integration tests that used to fail intermittently

* test: stabilize flaky CI tests (Hibernate cleanup retry, Shibboleth auth sequence reset, ORCID assertion hardening)

* test: fix flaky ITs at the source (live ORCID, Shibboleth config-reload) + Hibernate CME diagnostics

ORCID CachingOrcidRestConnectorTest no longer hits the live ORCID sandbox:
search/getLabel/search_fail mock the HTTP layer (httpGet made protected) with a
canned expanded-search response, so they are deterministic instead of asserting
against fluctuating sandbox data.

Shibboleth WWW-Authenticate flakiness: add a test-only config-definition.xml with
config-reload=false. Runtime setProperty(...AuthenticationMethod...) overrides were
silently discarded whenever the auto-reload listener rebuilt the combined config
(restoring clarin-dspace.cfg's [Password, ClarinShib] default), intermittently
leaking 'password realm' into the header. Verified: with auto-reload off the override
survives; the explicit reloadConfig() reset in @After still works.

Hibernate ConcurrentModificationException in @After cleanup: the per-session JDBC
ResourceRegistry is not thread-safe, so the CME means two threads touch one Session.
Capture a full thread dump on CME (target/cme-dumps/) to identify the colliding
thread in CI; keep a resilient retry so an already-passed test isn't failed by this
teardown race. (Context.finalize() ruled out: sessions are thread-local.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: revert IT-env config-reload=false override

Disabling config auto-reload globally in the test environment broke
AuthorizeConfigIT.testReloadConfiguration, which deliberately verifies that
AuthorizeConfiguration picks up live changes written to local.cfg via the
auto-reload mechanism. Auto-reload is a tested feature here, so it must not be
disabled to work around the Shibboleth WWW-Authenticate flakiness.

The Shibboleth flakiness (runtime setProperty override discarded when the combined
config is rebuilt) needs a reload-safe fix in the auth test instead; tracked
separately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: make Shibboleth auth-sequence override reload-safe (fix WWW-Authenticate flakiness)

The flaky 'password realm' leak in AuthenticationRestControllerIT had this root cause:
configurationService.setProperty(plugin.sequence...AuthenticationMethod, ...) only
updates the in-memory view of the combined configuration. That view is discarded
whenever it is rebuilt, and the auto-reload listener rebuilds it as soon as any
reloadable cfg file's mtime changes mid-run (e.g. another test writing local.cfg).
When that rebuild lands between the override and the request, clarin-dspace.cfg's
default [PasswordAuthentication, ClarinShibAuthentication] returns and 'password realm'
leaks into the header. The previous clear-then-set helper did not help (it is
equivalent to a plain setProperty).

Fix: set the sequence via a JVM system property (highest-precedence override layer,
re-read on every rebuild) + reloadConfig(), and clear it in @After. This survives
auto-reload without disabling it (so AuthorizeConfigIT, which verifies auto-reload,
still passes).

Verified in the real /api/authn/status endpoint: an explicit reloadConfig() after a
setProperty override reproduces the leak, while the system-property approach keeps the
header Shibboleth-only across rebuilds. Full AuthenticationRestControllerIT (43 tests)
passes, and running it alongside ClarinAuthenticationRestControllerIT /
AnonymousAdditionalAuthorizationFilterIT confirms the property does not leak across classes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: add Hibernate concurrency monitor + CI upload to pinpoint @After CME

The intermittent ConcurrentModificationException in @After cleanup is a genuine
cross-thread data race on Hibernate's per-session, non-thread-safe JDBC
ResourceRegistry (xref): a second thread mutates the test thread's session while
it commits/rolls back. Verified against hibernate-core-5.6.15 sources that the
releaseResources forEach lambda never touches xref, so single-thread re-entrancy
is impossible (this disproves the earlier HHH-15116 single-thread theory). The
window is microseconds, so it does not reproduce locally even with deliberate
cross-thread session sharing; it only surfaces under CI load.

A live thread dump of a running IT JVM shows NO legitimate background thread ever
touches Hibernate (all are Solr/HTTP/Jetty/JVM). So the culprit is a transient
thread, and any non-test thread caught inside Hibernate JDBC/session code is by
definition the offender.

- HibernateConcurrencyMonitor: JVM-wide background sampler that records (de-duped)
  any non-test thread found inside org.hibernate.{resource.jdbc,engine.jdbc,
  internal.SessionImpl}; flushed to target/cme-dumps/ on CME and at JVM shutdown.
  Pure observer, never changes test behaviour.
- AbstractIntegrationTestWithDatabase: start the monitor and mark the JUnit thread
  in setUp; flush it alongside the existing thread dump on a captured CME.
- build.yml: always-upload **/target/cme-dumps/** (not gated on failure) so a
  successful cleanup retry no longer hides the diagnostic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: don't close iterate() Hibernate stream from a finalize() (root cause of flaky CME)

Root cause of the intermittent ConcurrentModificationException in @After integration-test
cleanup, identified via the HibernateConcurrencyMonitor CI dumps: the GC Finalizer thread,
running org.dspace.core.AbstractHibernateDAO$1.finalize(), closed the Hibernate Stream
returned by AbstractHibernateDAO.iterate(). Closing a stream closes its ScrollableResults,
which mutates the owning Session's per-session, non-thread-safe JDBC ResourceRegistry (xref)
- but on the Finalizer thread, concurrently with the thread that owns the session. When that
collided with the owning thread's commit/rollback (releaseResources -> xref.forEach), it threw
ConcurrentModificationException. The CI dumps showed this exact finalizer stack as the only
non-test thread inside Hibernate in dspace-api, and present in dspace-server-webapp too.

This was confirmed genuine cross-thread access (not the previously assumed single-thread/HHH
bug): verified against hibernate-core-5.6.15 sources that the releaseResources forEach lambda
never touches xref, so single-thread re-entrancy is impossible.

Fix: close the backing stream on the owning thread when iteration is exhausted, and remove the
finalize() override. An iterator abandoned before exhaustion is released safely when its
Context/Session is closed (releaseResources then runs on the owning thread).

Adds AbstractHibernateDAOIteratorIT to guard against reintroducing a stream-closing finalizer.
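
The close-on-exhaustion idea can be sketched roughly as follows. This is only an illustration, not the actual AbstractHibernateDAO code: it assumes a plain java.util.stream.Stream in place of the Hibernate stream, and the class name ClosingIterator is hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.stream.Stream;

// Hypothetical illustration: an iterator that closes its backing stream on the
// calling (owning) thread once iteration is exhausted. It has no finalize().
final class ClosingIterator<T> implements Iterator<T> {
    private final Stream<T> stream;
    private final Iterator<T> delegate;
    private boolean closed;

    ClosingIterator(Stream<T> stream) {
        this.stream = stream;
        this.delegate = stream.iterator();
    }

    @Override
    public boolean hasNext() {
        if (closed) {
            return false;
        }
        if (delegate.hasNext()) {
            return true;
        }
        // Exhausted: release resources here, on the thread that is iterating,
        // and only once.
        closed = true;
        stream.close();
        return false;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return delegate.next();
    }
}
```

In this sketch an iterator abandoned before exhaustion never closes the stream itself; as described above, that case is left to whoever owns the enclosing resource.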

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: remove broken Context.finalize() that leaked finalizer-thread sessions

Context.finalize() ran on the GC Finalizer thread and called
dbConnection.isTransActionAlive()/abort(), which resolve sessionFactory.getCurrentSession()
to a brand-new session bound to the Finalizer thread - never the (now-unreachable) thread
that opened the Context. So it could not roll back the Context's transaction anyway; it only
opened and leaked a throwaway Hibernate session on the Finalizer thread, and threw
IllegalStateException once the SessionFactory was closed (seen in the CI thread dumps used to
diagnose the flaky integration-test ConcurrentModificationException).

Abandoned Contexts are cleaned up safely when their owning thread's session ends; callers
already close Contexts via complete()/abort()/try-with-resources (Context is AutoCloseable).

Removes the now-redundant ContextTest.testFinalize (close()/abort() are covered by
testClose/testAbort/testAbort2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: remove flaky-CME diagnostic scaffolding and teardown retry (root cause fixed)

The intermittent @After ConcurrentModificationException is now fixed at its source
(AbstractHibernateDAO.iterate no longer closes its Hibernate stream from a finalizer; broken
Context.finalize() removed). The temporary diagnostics that pinpointed it are no longer needed:

- Restore AbstractIntegrationTestWithDatabase.destroy() to its plain form (drop the 3x cleanup
  retry and the per-CME thread dump) and remove the HibernateConcurrencyMonitor wiring.
- Delete HibernateConcurrencyMonitor.
- Revert the build.yml always-upload of target/cme-dumps.

CI keeps -Dfailsafe.rerunFailingTestsCount=2 as the generic flaky-test safety net.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* revert: keep Context.finalize() (out of scope, not the CME cause)

Reverts the Context.finalize() removal (and the ContextTest.testFinalize deletion). The flaky
@After ConcurrentModificationException is fully fixed by the AbstractHibernateDAO.iterate()
change alone; Context.finalize() runs on a single GC Finalizer thread against its own
finalizer-thread session and provably cannot cause that cross-thread xref race. Removing a
finalizer from this core, widely-used class is a riskier change that does not belong in a
flaky-test fix, so leave Context untouched. The (pre-existing, harmless) finalizer-thread
session it opens can be addressed separately if desired.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: address review comments on flaky-test fix

- AbstractHibernateDAOIteratorIT: add Javadoc to the test method and walk the iterator's full
  class hierarchy (up to Object) when asserting no finalize() override, so a finalizer
  reintroduced on a superclass/helper is also caught (per CodeRabbit review).
- AuthenticationRestControllerIT: wrap an over-length (122 char) Javadoc line.
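
The hierarchy walk for the no-finalizer assertion might look something like the sketch below. The helper name FinalizerCheck is hypothetical and this is not the code of AbstractHibernateDAOIteratorIT; it only shows the reflection technique of checking every class up to (but excluding) Object.

```java
import java.lang.reflect.Method;

// Hypothetical helper: reports whether a class, or any of its superclasses
// below java.lang.Object, declares its own no-argument finalize() method.
final class FinalizerCheck {
    private FinalizerCheck() { }

    static boolean declaresFinalizer(Class<?> type) {
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            // getDeclaredMethods() returns only the methods declared by c itself,
            // so an inherited Object.finalize() is never reported.
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals("finalize") && m.getParameterCount() == 0) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A test could then assert that declaresFinalizer(iterator.getClass()) is false, which also catches a finalizer reintroduced on a superclass.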

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* Security patches from vanilla DSpace 7.6.7 (CVE-2026-49830, CVE-2026-49831) (#1340)

* ORE aggregated resource URI validation

(cherry picked from commit 7ac17f6)

* Velocity and template safety for Email and LDN messages

* Safer Velocity configuration
* New "message.templates.allowed-config" config
* Remove "UnmodifiableConfiguration" in favour of a
  simple Map of whitelisted Config keys/values
* Centralise Velocity config in core Utils
* Small javadoc changes

(cherry picked from commit b2d6141)
(cherry picked from commit 5b31db5)

* Better null checking in allowed config props

(cherry picked from commit 6b66531)
(cherry picked from commit 46a0dfb)

* Access configurationService at runtime, not rely on class setup

(cherry picked from commit 5803819)
(cherry picked from commit 4be430f)

* Remove strict mode Velocity engine configuration (allow nulls)

(cherry picked from commit 655fc62)

* Filter requests for JSPs or traversal

(cherry picked from commit cf9be85)
(cherry picked from commit dc3e455)

* Add additional logging to GlobalRequestSecurityFilter

(cherry picked from commit 295a046)
(cherry picked from commit 0b1deae)

* Fix import order

(cherry picked from commit e2e6a79)
(cherry picked from commit 2e40077)

* Update sitemap traversal test expectations

(cherry picked from commit 56ae287)
(cherry picked from commit 1a3dfd7)

* Backport GlobalRequestSecurityFilter for javax

(cherry picked from commit 8a2eee9)

* Add secure file access methods

(cherry picked from commit 22bec44)

* Backport Curation I/O using secure file access

Removes some JDK >= 16 usage

(cherry picked from commit 55905a2)

* Curation config support for allowed base paths

(cherry picked from commit 4502224)

* Move curation -r reporter param to CLI only

(cherry picked from commit 277af82)

* Fix import order

(cherry picked from commit a757221)

* Ignore CurationScriptIT -T taskFile tests, to rewrite w/ CLI

(cherry picked from commit 6437472)
(cherry picked from commit 37cd6eb)

* Move taskfile -T option to CLI script config only

(cherry picked from commit 00e4979)
(cherry picked from commit 27708ea)

* UFAL/Allow lr.help.mail in email template config allowlist

The 7.6.7 Velocity hardening restricts templates to an allowlisted
"config" map (Utils.getAllowedTemplateConfig). UFAL/CLARIN templates
(clarin_download_link_admin, clarin_token, matomo_report,
share_submission) reference config.get('lr.help.mail'), which vanilla
DSpace does not ship, so it was missing from the allowlist and those
emails would render a null help address. Add it to dspace.cfg only;
Utils.java stays identical to upstream.
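
As a rough illustration of why a key missing from the allowlist renders as null in a template, consider the sketch below. It is not the upstream Utils.getAllowedTemplateConfig implementation; the class and method names are hypothetical and the configuration is modelled as a plain Map.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an allowlisted "config" map for templates: only keys
// named in the allowlist are copied, and keys without a value are skipped.
final class TemplateConfigAllowlist {
    private TemplateConfigAllowlist() { }

    static Map<String, String> allowedConfig(Map<String, String> allProperties,
                                             Collection<String> allowedKeys) {
        Map<String, String> allowed = new LinkedHashMap<>();
        for (String key : allowedKeys) {
            String value = allProperties.get(key);
            if (value != null) {
                allowed.put(key, value);
            }
        }
        // Templates get a read-only view; anything not allowlisted is absent,
        // so a lookup for it returns null.
        return Collections.unmodifiableMap(allowed);
    }
}
```

Under this model, a template lookup of 'lr.help.mail' returns a value only once the key is present in the allowlist, which is what the dspace.cfg change provides.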

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Kim Shepherd <kim@shepherd.nz>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* Autolabel for new issues (#1341)

Co-authored-by: Matus Kasak <matus.kasak@dataquest.sk>

* [Port to dtq-dev] Issue 1364: tgz file preview fix (#1338)

* Issue 1364: tgz file preview fix (ufal#1372)

* Issue 1364: tgz file preview fix

* resolve MR comments + fixed test

* Update help message for force preview option

* get rid of the password requirement on FilePreview script

---------

Co-authored-by: Ondřej Košarko <kosarko@ufal.mff.cuni.cz>
(cherry picked from commit 00a2a37)

* PR Comments

---------

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* [Port to dtq-dev] Issue 1343: add PUT and DELETE endpoint methods to ClarinLicenseLabel REST repository (#1325)

* Issue 1343: add PUT and DELETE endpoint methods to ClarinLicenseLabel REST repository (ufal#1357)

* Issue 1343: add PUT and DELETE endpoint methods to ClarinLicenseLabelRest repository

* resolve Copilot comments

* fixed PUT request in ClarinLicenseLabelRestRepository

* added check for Clarin License Label -> Label string to be shorter than 5 characters

* implement correct put method in ClarinLicenseLabelRestRepository

* add constraints to license_label table: make label UNIQUE, make license_label not deletable when used in clarin licenses

* change order of deleting objects in test cleanup(): delete license objects before license_label objects (to satisfy license_label constraints)

* resolve PR Copilot comments, fixed failing ClarinWorkspaceItemRestRepositoryIT

* prevent creating duplicate Clarin License Labels in REST API

* not necessary to trim label twice

* minor fixes, suggested by Copilot

* Rename SQL migration files to use today's date (2026.06.01)

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
(cherry picked from commit b041c90)

* Fix test compilation: update createClarinLicense call sites for new label arg

The backport added a `label` String parameter to the createClarinLicense
test helper but left six 4-arg call sites unchanged, breaking testCompile
in dspace-server-webapp. Pass a label at each remaining call site
("lbl"; "lbl1"/"lbl2" for the paired-license test) to match the helper's
new signature, preserving the previously hard-coded "lbl" behaviour.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* PR comments: code cleanup

---------

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* AI-Skills/Wire private AI skills submodule (.dspace-skills) (#1343)

* test: de-flake ORCID cache tests and ZIP-download IT (#1344)

* test: de-flake ORCID cache tests and ZIP-download IT

Two independent flaky tests keep turning the dtq-dev pipeline red after #1321:

1. CachingOrcidRestConnectorTest.testCachable / testCacheableWithError still
   hit the live ORCID sandbox (#1321 only mocked getLabel/search/search_fail).
   They use the real Spring @Cacheable bean (to exercise the CGLIB caching
   proxy), so they could not be spied. Point the bean's apiURL at a local
   MockWebServer serving the canned orcid-expanded-search.xml instead -- keeps
   the real caching proxy and HTTP transport under test, removes the network
   dependency. No production change.

2. MetadataBitstreamControllerIT.downloadAllZip compared the response to a
   locally-built ZIP byte-for-byte; a ZIP entry's DOS timestamp (2s resolution)
   defaults to "now" on both sides and differs across a 2s boundary. Assert the
   unzipped entry name + content instead of raw bytes.
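
A minimal sketch of comparing ZIPs by entry name and content rather than raw bytes, using only java.util.zip. The class and method names are hypothetical and this is not the code in MetadataBitstreamControllerIT; the explicit timestamps stand in for the "now" default that differs between the two sides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for timestamp-independent ZIP comparison.
final class ZipEntryComparison {
    private ZipEntryComparison() { }

    // Builds a one-entry ZIP whose entry carries the given modification time.
    static byte[] zipOf(String name, String content, long timeMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry(name);
            entry.setTime(timeMillis);
            out.putNextEntry(entry);
            out.write(content.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes a ZIP into entry name -> UTF-8 content, ignoring timestamps.
    static Map<String, String> readEntries(byte[] zipBytes) {
        Map<String, String> entries = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            byte[] buffer = new byte[4096];
            while ((entry = in.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                while ((read = in.read(buffer)) != -1) {
                    content.write(buffer, 0, read);
                }
                entries.put(entry.getName(),
                        new String(content.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

Two archives built with zipOf for the same name and content but timestamps ten seconds apart differ as byte arrays, while their readEntries maps are equal, so an assertion on the decoded map is stable across the timestamp boundary.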

Test-only changes. Verified locally (ORCID class 8/8 green, repeated offline
runs; webapp test module compiles; checkstyle clean).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: assert exactly one ZIP entry in downloadAllZip

Address CodeRabbit: a Map keyed by entry name could mask duplicate ZIP entries
(same filename overwrites). Track the entry count separately and assert it is 1,
so an unexpected extra entry fails the test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: de-flake SSR authorization and Solr TopCountries ITs

Two more intermittent failures on dtq-dev, both addressed at the root cause
(test-only changes):

1. AuthorizationRestRepositoryIT.findByObjectSSRTest flaked with 400 instead of
   200. The test sets dspace.server.ssr.url (used by Utils.getBaseObjectRestFromUri
   to resolve the request URI) and the AlwaysThrowExceptionFeature.turnoff flag via
   configurationService.setProperty(...). Such in-memory overrides are silently
   dropped when the combined config is rebuilt by the auto-reload listener (fires on
   any reloadable cfg file mtime change mid-run). When ssr.url is dropped the URI no
   longer resolves -> 400; if turnoff were also dropped the /search/object path would
   let alwaysexception throw -> 500. Fix: set both via JVM system properties (+
   reloadConfig) so they sit in the highest-precedence override layer and survive
   auto-reload, cleared in @After. Same pattern as #1321's Shibboleth fix
   (AuthenticationRestControllerIT#setAuthenticationMethodSequence). Applied to all
   three SSR tests.

2. StatisticsRestRepositoryIT.topCountriesReport_Community_Visited flaked with an
   empty report (points: []). postView() commits with waitSearcher=false, so the
   just-posted view events can be invisible to the immediately-following report query
   (and a dropped solr-statistics.autoCommit override would skip the commit entirely).
   Fix: after posting the view events, force solrLoggerService.commit() (waitSearcher
   =true) so the events are flushed and visible before the report is queried.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* [Backport dtq-dev] Issue 1360: allow to change license in workflow item, in PATCH operation (ufal#1365) (#1327)

* Issue 1360: allow to change license in workflow item, in PATCH operation (ufal#1365)

* Issue 1360: allow to change license in workflow item, in PATCH operation

* improve fix + Integration test

* resolve Copilot comments

* resolve more MR comments

* return 404 in case PATCH request references non existing license

* improve error message for task claimed by other user

* resolve second round of Copilot comments

* small code refactoring

* fixed failing test

(cherry picked from commit 40672c3)

* resolve PR comments

---------

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* [Port to dtq-dev] Issue ufal#1317 metadata health check (#1307)

* Issue ufal#1317 metadata health check (ufal#1338)

* Issue 1317: health-check for metadata - initial commit

* implement MetadataCheck report

* improve MetadataCheck

* improve MetadataCheck with more sophisticated selection of error/warning messages

* code cleanup

* rename qa-metadata-error-patterns.json to metadata-check-patterns.json

* implement copilot suggestions

* add test for Metadata check to HealthReportIT

* improved documentation

* added test

* improve documentation

* set default error dispersion quota to 5

(cherry picked from commit 39157a5)

* add JavaDoc description

* resolve coderabbitai comment

* update report-diff-fields.json

* resolve Copilot comments

---------

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* test: de-flake ItemHandleCheckerIT (mock the live handle resolver) (#1346)

* test: de-flake ItemHandleCheckerIT (mock the live handle resolver)

ItemHandleCheckerIT.testItemHandleNotFound intermittently failed on dtq-dev with
a synthetic 617 (SocketTimeoutException) instead of 404. The checkhandles task
(ItemHandleChecker) issues a HEAD request to each item's handle URL
(handle.canonical.prefix + handle) with a 3s read timeout; the test pointed the
prefix at the live http://hdl.handle.net/ and asserted the resolver returns 404
(non-existent handle) and 302->200 (a real handle). When the live resolver was
slow/unreachable the HEAD timed out -> 617 -> red pipeline. Same class of flake
as the ORCID tests; not a regression (the merge commit passed many other runs).

Fix (test-only): serve the handle URLs from a local okhttp3 MockWebServer. setUp
sets handle.canonical.prefix to the mock base URL; a path-based dispatcher returns
302 (+Location to a -target path) for the redirect handle, 200 for the target, and
404 otherwise. The real/invalid/ignored URLs are derived from the mock base instead
of hard-coded hdl.handle.net literals; the server is closed in @After. All six tests
keep their original assertions and behavior; only the live-network hop is removed.
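
The path-based stub can be sketched without extra dependencies as shown below. Note the assumptions: this uses the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in for okhttp3 MockWebServer, and the class name and handle paths are hypothetical, not those used in ItemHandleCheckerIT.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

// Hypothetical local stand-in for a handle resolver: 302 for one path,
// 200 for its redirect target, 404 for everything else.
final class LocalHandleStub {
    private LocalHandleStub() { }

    static HttpServer start() {
        try {
            // Port 0 lets the OS pick a free port on the loopback interface.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String path = exchange.getRequestURI().getPath();
                int status;
                if (path.equals("/123456789/redirect")) {
                    exchange.getResponseHeaders().add("Location", "/123456789/redirect-target");
                    status = 302;
                } else if (path.equals("/123456789/redirect-target")) {
                    status = 200;
                } else {
                    status = 404;
                }
                exchange.sendResponseHeaders(status, -1); // -1: no response body
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Issues a HEAD request without following redirects, with a 3s read timeout.
    static int head(String baseUrl, String path) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(baseUrl + path).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setInstanceFollowRedirects(false);
            conn.setReadTimeout(3000);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test would derive its base URL from server.getAddress().getPort(), point the resolver prefix at it, and call server.stop(0) in teardown.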

Verified locally: 6/6 green across 4 runs, no external network involved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: restore handle.canonical.prefix in ItemHandleCheckerIT teardown

Address CodeRabbit: setUp overwrites the shared handle.canonical.prefix with the
per-run mock-server URL. Capture the previous value and restore it in destroy()
(after closing the mock server) so a later test in the same JVM is never left
pointing at the now-closed localhost port. (The superclass destroy() reloadConfig()
already resets it, but restoring explicitly makes the test self-contained.)

Re-verified locally: 6/6 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* UFAL/Obtain special groups from user context when new token is generated (on token refresh) (ufal#1378) (#1347)

* Issue 1373: obtain special groups from user context when new token is generated (on token refresh)

* resolve Copilot comments

* resolve Copilot Comments: compute special groups only when user is authenticated

* Remove HttpSession dependency from ClarinShibAuthentication

Use request-scoped attributes for shib.authenticated instead of
HttpSession/JSESSIONID, aligning with upstream ShibAuthentication.
Follow-up to ufal#1373/ufal#1378.

* Guard against null special groups in Context.getSpecialGroups

A special-group UUID may reference a Group that has since been deleted;
GroupService.find returns null in that case. The list was built with an
unconditional add, so it could contain null elements, which caused an NPE
downstream (e.g. SpecialGroupClaimProvider.getValue maps group.getID()
while generating the JWT sg claim on token refresh). Filter nulls once
here so every caller is covered. Follow-up to ufal#1373/ufal#1378.
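
The null guard amounts to filtering unresolved lookups while the list is built. A minimal sketch, assuming a generic finder function in place of GroupService.find; the class and method names are hypothetical and this is not the actual Context.getSpecialGroups code.

```java
import java.util.List;
import java.util.Objects;
import java.util.UUID;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: resolve group IDs to groups, dropping IDs whose group
// no longer exists (the finder returns null for those).
final class SpecialGroupResolver {
    private SpecialGroupResolver() { }

    static <G> List<G> resolve(List<UUID> groupIds, Function<UUID, G> finder) {
        return groupIds.stream()
                .map(finder)
                .filter(Objects::nonNull) // a deleted group resolves to null
                .collect(Collectors.toList());
    }
}
```

Because the filtering happens where the list is produced, downstream code that maps each element (for example to its ID) never sees a null.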

---------


(cherry picked from commit 4c294b2)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

* UFAL/fix: DOI Organizer creates duplicate dc.identifier.doi metadata (ufal#1368) (#1350)

* Issue 1361 fix: DOI Organizer creates duplicate dc.identifier.doi metadata

* copilot comments

* Issue 1361: make DOI metadata save additive, report duplicates via QA

saveDOIToObject now adds dc.identifier.doi only when that exact value is
not already present, and no longer deletes metadata. The previous fix
cleared existing values when more than one was found or when a different
DOI was present; since this method runs after the DOI has already been
registered with the external agency, silently dropping a (possibly
legacy/citable) identifier is lossy and irreversible. The operation stays
idempotent, so re-registration no longer creates duplicate values.
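
The additive, idempotent behaviour described above can be sketched as follows. The metadata field is modelled as a plain list of strings and the names are hypothetical; this is not the actual saveDOIToObject implementation.

```java
import java.util.List;

// Hypothetical sketch of an additive, idempotent metadata write: the value is
// added only if that exact value is absent, and nothing is ever removed.
final class DoiMetadataWriter {
    private DoiMetadataWriter() { }

    // Returns true if the DOI was added, false if it was already present.
    static boolean addIfAbsent(List<String> existingDois, String doi) {
        if (existingDois.contains(doi)) {
            return false;
        }
        existingDois.add(doi);
        return true;
    }
}
```

Calling addIfAbsent twice with the same DOI leaves a single value, while a pre-existing different DOI stays in the list next to the new one, so the item ends up with two values for later manual review.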

Items that legitimately end up with more than one dc.identifier.doi value
are now surfaced for manual review by adding dc.identifier.doi to the
ItemMetadataQAChecker noDuplicate list (metadataqa curation task) rather
than being cleaned up silently in the write path.

Tests: flip the replace test to assert a pre-existing different DOI is
preserved alongside the new one, keep the idempotency test, and add a QA
checker IT asserting an item with two DOIs fails curation.

* local field renaming

---------


(cherry picked from commit 3b7db4c)

Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>

---------

Co-authored-by: Ondřej Košarko <ko_ok@centrum.cz>
Co-authored-by: Milan Kuchtiak <kuchtiak@ufal.mff.cuni.cz>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kosarko <1842385+kosarko@users.noreply.github.com>
Co-authored-by: Paurikova2 <michaela.paurikova@dataquest.sk>
Co-authored-by: Kasinhou <129340513+Kasinhou@users.noreply.github.com>
Co-authored-by: Matus Kasak <matus.kasak@dataquest.sk>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Kim Shepherd <kim@shepherd.nz>
Co-authored-by: jurinecko <juraj.roka@dataquest.sk>
Development

Successfully merging this pull request may close these issues.

UFAL/Health report bugs and improvements

5 participants